An Efficient Japanese Parsing Algorithm for Computer-Assisted Language Learning

Authors

  • Chi-Hong Leung
  • Wun-Na Yung
Abstract

Instructional grammar is often used in Computer-Assisted Language Learning (CALL), and grammatical error detection is an important feature of such systems. Detecting errors is not easy in Japanese, however, because no delimiter separates consecutive words in a Japanese sentence. Word segmentation is the process of identifying the proper word boundaries, and it has to be performed before a Japanese sentence can be parsed syntactically. Traditionally, word segmentation and syntactic parsing are carried out as two separate, consecutive steps. An algorithm that combines Japanese word segmentation and syntactic parsing into one process can increase the overall efficiency.

CALL is the use of computers for language teaching and learning. Higgins [5] described three models of grammar teaching: instructional, revelatory and conjectural. Instructional grammar is often used in CALL because it is easy to computerize.

Japanese characters are grouped together to form words, which are considered the basic syntactic and semantic units of Japanese. When words are placed together to form a sentence, however, the result is a character string with no indication of which substrings are words. Researchers have attempted a number of methods for word segmentation [4,7,8,9,11].

Examining the syntactic structure of a sentence requires a grammar and a parsing technique [6,10]. Chomsky's classification of languages [1] includes context-free, context-sensitive and unrestricted grammars. Context-free grammars are a very important class because they are powerful enough to describe most of the structure of natural languages. The Earley parser [3] was designed for context-free grammars; examples can be found in the works of Church [2] and Wang [12]. The parser is divided into three parts: predictor, scanner and completer.
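As a rough illustration of how segmentation and parsing can share one process, the following sketch implements a small Earley recognizer whose scanner reads dictionary words directly from the unsegmented character string. It is not the authors' implementation: the grammar, the lexicon, the part-of-speech tags and the names `State`, `GRAMMAR`, `LEXICON` and `recognize` are all hypothetical, and the grammar is assumed to contain no empty productions.

```python
from typing import NamedTuple


class State(NamedTuple):
    lhs: str     # left-hand side of the rule
    rhs: tuple   # right-hand side symbols
    dot: int     # number of rhs symbols already recognized
    origin: int  # character index where this rule started


# Hypothetical toy grammar (assumption: no empty productions).
# Symbols that are not keys of GRAMMAR are treated as terminal POS tags.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("N", "P")],
    "VP": [("NP", "V"), ("V",)],
}

# Hypothetical lexicon: surface word -> set of POS tags.
LEXICON = {
    "猫": {"N"},      # cat
    "魚": {"N"},      # fish
    "が": {"P"},      # subject particle
    "を": {"P"},      # object particle
    "食べる": {"V"},  # eat
}


def recognize(text, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    """Return (accepted, number_of_states) for an unsegmented string."""
    n = len(text)
    chart = [set() for _ in range(n + 1)]  # one state set per character index
    goal = State("GOAL", (start,), 0, 0)
    chart[0].add(goal)

    for i in range(n + 1):
        agenda = list(chart[i])

        def add(k, state):
            # Store a state once; states for the current index are processed too.
            if state not in chart[k]:
                chart[k].add(state)
                if k == i:
                    agenda.append(state)

        while agenda:
            st = agenda.pop()
            if st.dot == len(st.rhs):
                # Completer: advance every state that was waiting for st.lhs.
                for prev in list(chart[st.origin]):
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == st.lhs:
                        add(i, prev._replace(dot=prev.dot + 1))
            elif st.rhs[st.dot] in grammar:
                # Predictor: expand the expected non-terminal at index i.
                for rhs in grammar[st.rhs[st.dot]]:
                    add(i, State(st.rhs[st.dot], rhs, 0, i))
            else:
                # Scanner with segmentation: try only lexicon words that start
                # at index i AND carry the terminal tag the parser expects.
                tag = st.rhs[st.dot]
                for word, tags in lexicon.items():
                    if tag in tags and text.startswith(word, i):
                        add(i + len(word), st._replace(dot=st.dot + 1))

    accepted = goal._replace(dot=1) in chart[n]
    return accepted, sum(len(states) for states in chart)


if __name__ == "__main__":
    for sentence in ("猫が魚を食べる", "猫が食べるを"):
        ok, n_states = recognize(sentence)
        print(sentence, ok, n_states)
```

With this toy grammar the first sentence should be accepted and the second rejected. Because the scanner proposes a word only when its tag matches a predicted terminal, word boundaries that no rule can use never enter the chart, which is the intuition behind counting generated states as an efficiency measure.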
In the proposed approach, the scanner is modified to segment a word that matches the expected terminal symbol. Hence only segmentations that can lead to correct parses are generated, which reduces the processing time. An experiment was performed to evaluate the efficiency of the algorithm. The test data were derived from the Nihon Keizai Shimbun corpus (1993-1994), made available by the Linguistic Data Consortium, University of Pennsylvania. The average number of states generated per sentence was calculated for both approaches: the new approach generated 5,289.45 states, while the traditional approach generated 648,592.25 states, more than 120 times as many. These results indicate that combining word segmentation with syntactic parsing, as described in this paper, can increase the processing speed significantly.

References

[1] Chomsky, N., Syntactic Structures, Mouton, The Hague,

Similar resources

An Efficient Chinese Parsing Algorithm for Computer-Assisted Language Learning

Instructional grammar is often used in Computer-assisted Language Learning (CALL) and the grammatical error detection is an important feature. However, it is not an easy task in Chinese language. There is no delimiter separating consecutive words in Chinese sentences. Word segmentation is a process in which proper word boundaries are identified. Before syntactic parsing of a Chinese sentence, w...

The Effect of Computer Assisted Cooperative Language Learning on Iranian High School Students' Language Anxiety and Reading Comprehension

This study explored the effectiveness of the two computer-assisted modes: cooperative and individual on improving Iranian high school students’ reading comprehension. It was also concerned with investigating the effectiveness of the two computer-assisted modes on the participants’ foreign language learning anxiety (FLLA). The sample of the study consisted of two intact groups, each containing 2...

Teaching approaches to Computer Assisted Language Learning

Computers have been used for language teaching ever since the 1960s. Learning a second language is a challenging endeavor, and, for decades now, proponents of computer assisted language learning (CALL) have declared that help is on the horizon. We investigate the suitability of deploying speech technology in computer based systems that can be used to teach foreign language skills. In this case,...

Iranian EFL Learners’ Perception of the Efficacy and Affordance of Activity Theory-based Computer Assisted Language Learning in Writing Achievement

Second language writing instruction has been greatly influenced by the growing importance of technology and the recent shift of paradigm from a cognitive to a social orientation in second language acquisition (Lantolf & Thorne, 2006). Therefore, the applications of computer assisted language learning and activity theory have been suggested as a promising framework for writing studies. The prese...

Publication date: 2003